1,782 research outputs found

    Collaborative Spatio-temporal Feature Learning for Video Action Recognition

    Spatio-temporal feature learning is of central importance for action recognition in videos. Existing deep neural network models either learn spatial and temporal features independently (C2D) or jointly with unconstrained parameters (C3D). In this paper, we propose a novel neural operation which encodes spatio-temporal features collaboratively by imposing a weight-sharing constraint on the learnable parameters. In particular, we perform 2D convolution along three orthogonal views of volumetric video data, which learns spatial appearance and temporal motion cues, respectively. By sharing the convolution kernels of different views, spatial and temporal features are collaboratively learned and thus benefit from each other. The complementary features are subsequently fused by a weighted summation whose coefficients are learned end-to-end. Our approach achieves state-of-the-art performance on large-scale benchmarks and won 1st place in the Moments in Time Challenge 2018. Moreover, based on the learned coefficients of different views, we are able to quantify the contributions of spatial and temporal features. This analysis sheds light on the interpretability of the model and may also guide the future design of algorithms for video recognition. Comment: CVPR 201
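The operation described in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the single-channel (T, H, W) volume, the helper names `conv2d_same` and `collaborative_conv`, the odd-sized kernel, and the softmax normalization of the fusion coefficients are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_same(plane, kernel):
    """2D cross-correlation with zero padding; output has the input's shape.

    Assumes an odd-sized kernel (for example 3x3).
    """
    kh, kw = kernel.shape
    padded = np.pad(plane, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    windows = sliding_window_view(padded, (kh, kw))  # (rows, cols, kh, kw)
    return np.einsum("hwij,ij->hw", windows, kernel)


def collaborative_conv(volume, kernel, coeffs):
    """Apply ONE shared 2D kernel along three orthogonal views of a video.

    volume: (T, H, W) array, kernel: 2D array, coeffs: three fusion scores.
    """
    T, H, W = volume.shape
    # H-W view: spatial appearance, one plane per time step.
    hw = np.stack([conv2d_same(volume[t], kernel) for t in range(T)], axis=0)
    # T-W view: temporal cues, one plane per image row.
    tw = np.stack([conv2d_same(volume[:, h, :], kernel) for h in range(H)], axis=1)
    # T-H view: temporal cues, one plane per image column.
    th = np.stack([conv2d_same(volume[:, :, w], kernel) for w in range(W)], axis=2)
    # Assumption: coefficients are softmax-normalized before the weighted sum.
    weights = np.exp(coeffs) / np.exp(coeffs).sum()
    return weights[0] * hw + weights[1] * tw + weights[2] * th
```

In a trained model both `kernel` and `coeffs` would be learned end-to-end; after training, the normalized weights give a rough reading of how much each view contributes.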

    A Layer Decomposition-Recomposition Framework for Neuron Pruning towards Accurate Lightweight Networks

    Neuron pruning is an efficient method for compressing a network into a slimmer one, reducing computational cost and storage overhead. Most state-of-the-art results are obtained in a layer-by-layer optimization mode, which discards the unimportant input neurons and uses the surviving ones to reconstruct output neurons that approximate the original ones. However, an overlooked problem is that information loss accumulates as the number of layers increases, since the surviving neurons no longer encode the entire information as before. A better alternative is to propagate the entire useful information to reconstruct the pruned layer instead of directly discarding the less important neurons. To this end, we propose a novel Layer Decomposition-Recomposition Framework (LDRF) for neuron pruning, by which each layer's output information is recovered in an embedding space and then propagated to reconstruct the following pruned layers with useful information preserved. We mainly conduct our experiments on the ILSVRC-12 benchmark with VGG-16 and ResNet-50. It should be emphasized that our results before end-to-end fine-tuning are significantly superior owing to the information-preserving property of our proposed framework. With end-to-end fine-tuning, we achieve state-of-the-art results of 5.13x and 3x speed-up with only 0.5% and 0.65% top-5 accuracy drop respectively, which outperform the existing neuron pruning methods. Comment: accepted by AAAI19 as ora
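For context, the conventional layer-by-layer mode that the abstract above contrasts itself with can be sketched as follows. This is not LDRF itself; the function name `prune_layer`, the weight-norm importance score, and the least-squares refit are assumptions chosen to illustrate "discard unimportant inputs, then reconstruct the outputs from the survivors" for a single linear layer.

```python
import numpy as np


def prune_layer(X, W, keep):
    """Prune input neurons of one linear layer and refit the remaining weights.

    X: (n_samples, n_in) layer inputs, W: (n_in, n_out) weights,
    keep: number of input neurons to retain.
    """
    Y = X @ W  # original layer outputs to be approximated
    # Assumption: an input neuron's importance is the norm of its weight row.
    importance = np.linalg.norm(W, axis=1)
    kept = np.sort(np.argsort(importance)[-keep:])
    # Refit weights so the surviving inputs reproduce Y as closely as possible.
    W_new, *_ = np.linalg.lstsq(X[:, kept], Y, rcond=None)
    return kept, W_new
```

Any residual between `X[:, kept] @ W_new` and `X @ W` is passed on to the next layer, which is the accumulation of information loss that the abstract points to.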

    Primary User Emulation Detection in Cognitive Radio Networks

    Cognitive radios (CRs) have been proposed as a promising solution for improving spectrum utilization via opportunistic spectrum sharing. In a CR network environment, primary (licensed) users have priority over secondary (unlicensed) users when accessing the wireless channel. Thus, if a malicious secondary user exploits this spectrum access etiquette by mimicking the spectral characteristics of a primary user, it can gain priority access to a wireless channel over other secondary users. This scenario is referred to in the literature as primary user emulation (PUE). This dissertation first covers three approaches for detecting primary user emulation attacks in cognitive radio networks, which can be classified into two categories. The first category is based on cyclostationary features and employs a cyclostationary calculation to represent the modulation features of the user signals. The calculation results are then fed into an artificial neural network for classification. The second category applies the video-processing method of action recognition in the frequency domain and includes two approaches. Both analyze the FFT sequences of wireless transmissions operating across a cognitive radio network environment and classify their actions in the frequency domain. The first approach employs a covariance descriptor of motion-related features in the frequency domain, which is then fed into an artificial neural network for classification. The second approach builds upon the first, but employs a relational database system to record the motion-related feature vectors of primary users on the frequency band. When a transmission has no matching record in the database, a covariance descriptor is calculated and fed into an artificial neural network for classification.
The dissertation concludes with a novel PUE detection approach which employs a distributed sensor network, where each sensor node works as an independent PUE detector. The emphasis of this work is on how these nodes collaborate to obtain the final detection results for the whole network. All of the proposed approaches have been validated via computer simulations as well as by experimental hardware implementations using the Universal Software Radio Peripheral (USRP) software-defined radio (SDR) platform.
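A covariance descriptor over a sequence of FFT frames, as mentioned in the abstract above, might be computed roughly as follows. The abstract does not say which motion-related features are used, so the four per-bin features here (bin index, magnitude, change over time, change over frequency) and the function name `covariance_descriptor` are purely illustrative assumptions.

```python
import numpy as np


def covariance_descriptor(frames):
    """Summarize a sequence of signal frames as a fixed-length vector.

    frames: (n_frames, n_samples) array of complex baseband samples.
    Returns the 10 upper-triangular entries of a 4x4 feature covariance.
    """
    mag = np.abs(np.fft.fft(frames, axis=1))  # (n_frames, n_bins)
    # Assumed "motion-related" features: how the spectrum changes between
    # consecutive frames and between neighboring frequency bins.
    d_time = np.diff(mag, axis=0, prepend=mag[:1])
    d_freq = np.diff(mag, axis=1, prepend=mag[:, :1])
    bins = np.broadcast_to(np.arange(mag.shape[1]), mag.shape)
    feats = np.stack([bins, mag, d_time, d_freq], axis=-1).reshape(-1, 4)
    cov = np.cov(feats, rowvar=False)  # 4x4 covariance across all bins
    return cov[np.triu_indices(4)]  # symmetric, so keep the upper triangle
```

The resulting vector has the same length regardless of how many frames were observed, which makes it convenient as input to a fixed-size classifier such as a neural network.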

    Frequency Rendezvous and Physical Layer Network Coding for Distributed Wireless Networks

    In this thesis, a transmission frequency rendezvous approach for secondary users deployed in decentralized dynamic spectrum access networks is proposed. Frequency rendezvous is a critical step in bootstrapping a wireless network that does not possess centralized control. Current techniques for enabling frequency rendezvous in decentralized dynamic spectrum access networks either require pre-existing infrastructure or use one of several simplifying assumptions regarding the architecture, such as the use of regularly spaced frequency channels for communications. Our proposed approach is designed to operate in a strictly decentralized wireless networking environment, where no centralized control is present and the spectrum does not possess pre-defined channels. In our proposed rendezvous algorithm, the most important step is pilot tone detection and receiver query. To achieve the shortest search time for the target receiver, an efficient scanning rule should be employed. In this thesis, three scanning rules are proposed and evaluated, namely: frequency sequence scanning, pilot tone strength scanning, and cluster scanning. To validate our results, we test our scanning rules with actual paging band spectrum measurements. Previous research on the security of network coding focuses on the protection of data dissemination procedures and the detection of malicious activities such as pollution attacks. The capabilities of network coding to detect other attacks have not been fully explored. In this thesis, a new mechanism based on physical layer network coding to detect wormhole attacks is proposed. When two signal sequences collide at the receiver, the difference between the two received sequences is determined by the receiver's distances to the senders. Therefore, by comparing the differences between the received sequences at two nodes, we can estimate the distance between them and detect fake neighbor connections created through wormholes.
While the basic idea is clear, we design many schemes at both the physical and network layers to turn the idea into a practical approach. Simulations using BPSK modulation at the physical layer show that the wireless nodes can effectively detect fake neighbor connections without the adoption of any special hardware.
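The abstract above names its scanning rules without defining them. One plausible reading of the first two, offered only as an assumption, is that detected pilot tones are queried either in ascending frequency order or in descending order of received strength. The type alias `Tone` and all function names below are hypothetical.

```python
from typing import List, Optional, Tuple

Tone = Tuple[float, float]  # (frequency in MHz, received power in dBm)


def frequency_sequence_order(tones: List[Tone]) -> List[Tone]:
    """Assumed rule: query detected pilot tones from lowest to highest frequency."""
    return sorted(tones, key=lambda tone: tone[0])


def pilot_strength_order(tones: List[Tone]) -> List[Tone]:
    """Assumed rule: query the strongest detected pilot tone first."""
    return sorted(tones, key=lambda tone: tone[1], reverse=True)


def queries_until_found(order: List[Tone], target_freq: float) -> Optional[int]:
    """Count receiver queries issued before the target's pilot tone is reached."""
    for count, (freq, _power) in enumerate(order, start=1):
        if freq == target_freq:
            return count
    return None  # the target's pilot tone was not among the detected tones
```

Under this reading, the search time of a rule is simply the position of the target receiver's tone in the resulting query order, which is how the two orderings could be compared on measured spectrum data.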